// Licensed to the Apache Software Foundation (ASF) under one or more
// contributor license agreements.  See the NOTICE file distributed with
// this work for additional information regarding copyright ownership.
// The ASF licenses this file to You under the Apache License, Version 2.0
// (the "License"); you may not use this file except in compliance with
// the License.  You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
= Evaluator

Apache Ignite ML comes with a number of machine learning algorithms that can be used to learn from and make predictions on data. When these algorithms are applied to build machine learning models, there is a need to evaluate the performance of the model on some criteria, which depends on the application and its requirements. Apache Ignite ML also provides a suite of classification and regression metrics for the purpose of evaluating the performance of machine learning models.

== Classification model evaluation

While there are many different types of classification algorithms, the evaluation of classification models all share similar principles. In a supervised classification problem, there exists a true output and a model-generated predicted output for each data point. For this reason, the results for each data point can be assigned to one of four categories:

* True Positive (TP) - label is positive and prediction is also positive
* True Negative (TN) - label is negative and prediction is also negative
* False Positive (FP) - label is negative but prediction is positive
* False Negative (FN) - label is positive but prediction is negative

Especially, these metrics are important for binary classification.

CAUTION: Multiclass classification evalution is not supported yet in Apache Ignite ML.

The full list of binary classification metrics supported in Apache Ignite ML is next:

* Accuracy
* Balanced accuracy
* F-Measure
* FallOut
* FN
* FP
* FDR
* MissRate
* NPV
* Precision
* Recall
* Specificity
* TN
* TP

The explanation and formulas for these metrics can be found https://en.wikipedia.org/wiki/Evaluation_of_binary_classifiers[here].


[source, java]
----
// Define the vectorizer.
Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>()
   .labeled(Vectorizer.LabelCoordinate.FIRST);

// Define the trainer.
SVMLinearClassificationTrainer trainer = new SVMLinearClassificationTrainer();

// Train the model.
SVMLinearClassificationModel mdl = trainer.fit(ignite, dataCache, vectorizer);

// Calculate all classification metrics.
EvaluationResult res = Evaluator
  .evaluateBinaryClassification(dataCache, mdl, vectorizer);

double accuracy = res.get(MetricName.ACCURACY)
----


== Regression model evaluation

Regression analysis is used when predicting a continuous output variable from a number of independent variables.

The full list of regression metrics supported in Apache Ignite ML is as follows:

* MAE
* R2
* RMSE
* RSS
* MSE


[source, java]
----
// Define the vectorizer.
Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>()
   .labeled(Vectorizer.LabelCoordinate.FIRST);

// Define the trainer.
KNNRegressionTrainer trainer = new KNNRegressionTrainer()
    .withK(5)
    .withDistanceMeasure(new ManhattanDistance())
    .withIdxType(SpatialIndexType.BALL_TREE)
    .withWeighted(true);

// Train the model.
KNNRegressionModel knnMdl = trainer.fit(ignite, dataCache, vectorizer);

// Calculate all classification metrics.
EvaluationResult res = Evaluator
  .evaluateRegression(dataCache, mdl, vectorizer);

double mse = res.get(MetricName.MSE);
----

